
    Case Study on Human-Robot Interaction of the Remote-Controlled Service Robot for Elderly and Disabled Care

    The continuing aging of the population and the growing number of people with mobility difficulties have led to increased research in the field of Assistive Service Robotics. These robots can help with daily tasks such as reminding users to take medication, serving food and drinks, controlling home appliances, and even monitoring health status. When assisting people in their homes, it should be noted that users will, most of the time, have to communicate with the robot themselves and be able to manage it in order to get the most out of its services. This research focuses on different methods of remote control of a mobile robot equipped with a robotic manipulator. It investigates in detail methods based on control via gestures, voice commands, and a web-based graphical user interface. The capabilities of these methods for Human-Robot Interaction (HRI) have been explored in terms of usability. In this paper, we introduce a new version of the robot Robco 19, a new Leap Motion sensor control of the robot, and a new multi-channel control system. The paper presents a methodology for performing the HRI experiments from the perspective of human perception and summarizes the results of applying the investigated remote control methods in real-life scenarios.
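    The abstract mentions a multi-channel control system merging gesture, voice, and web input. As a rough illustration of how such an arbiter might be structured (the channel names, priority ordering, and class names below are assumptions, not the paper's design), one could keep a priority queue of pending commands and let the most urgent channel win:

    ```python
    from dataclasses import dataclass, field
    import heapq

    # Hypothetical sketch of a multi-channel control arbiter: commands arrive
    # from gesture, voice, and web channels; the robot executes the pending
    # command from the highest-priority channel first (FIFO within a channel).

    CHANNEL_PRIORITY = {"gesture": 0, "voice": 1, "web": 2}  # lower = more urgent (assumed ordering)

    @dataclass(order=True)
    class Command:
        priority: int
        seq: int                              # tie-breaker preserving arrival order
        action: str = field(compare=False)

    class ControlArbiter:
        def __init__(self):
            self._queue = []
            self._seq = 0

        def submit(self, channel: str, action: str) -> None:
            heapq.heappush(self._queue, Command(CHANNEL_PRIORITY[channel], self._seq, action))
            self._seq += 1

        def next_action(self):
            return heapq.heappop(self._queue).action if self._queue else None

    arbiter = ControlArbiter()
    arbiter.submit("web", "move_forward")
    arbiter.submit("voice", "stop")
    print(arbiter.next_action())  # "stop": voice outranks web in this sketch
    ```

    A real system would also need timeouts and conflict resolution (e.g. an emergency stop pre-empting an in-progress motion), which this sketch omits.
    
    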

    Building of Broadcast News Database for Evaluation of the Automated Subtitling Service

    This paper describes the process of recording, annotating, correcting, and evaluating the new Broadcast News (BN) speech database KEMT-BN2, an extension of our older KEMT-BN1 and COST-278 databases used for the development of automatic continuous speech recognition for Slovak. The utilisation and statistics of the database are presented. The database was prepared for evaluating the automated BN transcription system developed in our laboratory, which is mainly used for generating subtitles for recorded BN shows. The speech database is the key component for training acoustic models for specific domains and for creating speaker- and anchor-adapted models.

    Speaker Recognition for Surveillance Application

    This paper addresses the speaker recognition problem in the context of a complex surveillance system. The proposed system extension makes it possible to identify the precise identity, or at least the gender, of a suspect by analysing captured voice recordings. Our solution is based on a text-independent approach using Mel-Frequency Cepstral Coefficients (MFCC) and the fundamental frequency to extract identity information from the voice signal. Gaussian Mixture Models with up to 1024 mixtures were used to classify more than 20 speakers. The paper compares and evaluates speech parametrizations and noise elimination techniques on noisy acoustic data. This system extension could help to reduce vandalism and increase the rate at which crimes are solved.
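    The core idea of GMM-based text-independent speaker identification can be sketched in a few lines. This is a deliberately simplified version — each speaker is modelled by a single diagonal-covariance Gaussian over synthetic feature vectors, whereas the paper uses GMMs with up to 1024 mixtures over MFCC and F0 features; the speaker names and dimensions are invented for illustration:

    ```python
    import numpy as np

    # Simplified sketch: model each enrolled speaker as one diagonal Gaussian
    # over feature frames, then identify an utterance by the model that gives
    # it the highest total log-likelihood.

    def train_model(frames: np.ndarray):
        """Fit a diagonal Gaussian to an (n_frames, n_dims) feature matrix."""
        mu = frames.mean(axis=0)
        var = frames.var(axis=0) + 1e-6       # variance floor keeps the density proper
        return mu, var

    def log_likelihood(frames, model):
        mu, var = model
        # Sum of per-frame diagonal-Gaussian log-densities.
        return float(np.sum(-0.5 * (np.log(2 * np.pi * var) + (frames - mu) ** 2 / var)))

    def identify(frames, models: dict):
        """Return the enrolled speaker whose model scores the utterance highest."""
        return max(models, key=lambda spk: log_likelihood(frames, models[spk]))

    rng = np.random.default_rng(0)
    models = {
        "spk_a": train_model(rng.normal(0.0, 1.0, (200, 13))),
        "spk_b": train_model(rng.normal(3.0, 1.0, (200, 13))),
    }
    test_utt = rng.normal(3.0, 1.0, (50, 13))   # frames drawn near speaker B
    print(identify(test_utt, models))           # expected: spk_b
    ```

    Extending this to a full GMM means replacing each single Gaussian with a weighted mixture trained by EM, but the argmax-over-log-likelihood decision rule stays the same.
    
    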

    Methodology for Training Small Domain-specific Language Models and Its Application in Service Robot Speech Interface

    This paper introduces a novel methodology for training small domain-specific language models from the domain vocabulary alone. The methodology is intended for situations where no training data are available and preparing an appropriate deterministic grammar is not a trivial task. It consists of two phases. In the first phase, a “random” deterministic grammar that can generate all possible combinations of unigrams and bigrams is constructed from the vocabulary. This random grammar then serves to generate a training corpus, from which a “random” n-gram model is trained; the model can be adapted in the second phase. Evaluation of the proposed approach has shown that the methodology is usable for small domains, and the results favor the designed method over constructing an appropriate deterministic grammar.
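    The first phase can be pictured concretely: from a bare vocabulary, enumerate every ordered word pair (every bigram the “random” grammar can produce), treat the result as a corpus, and estimate bigram probabilities from the counts. The function names and the toy vocabulary below are illustrative, not the paper's:

    ```python
    import itertools
    from collections import Counter

    # Sketch of phase one: a vocabulary-only "random" corpus containing every
    # bigram combination, and a maximum-likelihood bigram model trained on it.

    def generate_random_corpus(vocab):
        """All ordered word pairs — every bigram the random grammar can emit."""
        return [f"{w1} {w2}" for w1, w2 in itertools.product(vocab, repeat=2)]

    def train_bigram_model(corpus):
        """Maximum-likelihood bigram probabilities P(w2 | w1) from the corpus."""
        bigrams = Counter(tuple(line.split()) for line in corpus)
        unigrams = Counter(w1 for (w1, _w2) in bigrams.elements())
        return {(w1, w2): c / unigrams[w1] for (w1, w2), c in bigrams.items()}

    vocab = ["robot", "stop", "go"]
    corpus = generate_random_corpus(vocab)
    model = train_bigram_model(corpus)
    print(len(corpus))                # 9 sentences: 3 x 3 bigram combinations
    print(model[("robot", "stop")])   # uniform by construction: 1/3
    ```

    By construction the resulting model is uniform over bigrams; the point of the paper's second phase is precisely to adapt this uninformative prior once any in-domain data become available.
    
    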

    Feature Selection for Audio Surveillance in Urban Environment

    This paper presents the work leading to an acoustic event detection system designed to recognize two types of acoustic events (gunshots and breaking glass) in an urban environment. For this purpose, extensive front-end processing was performed to obtain an effective parametric representation of the input sound: MFCC features and features computed during their extraction (MELSPEC and FBANK), MPEG-7 audio descriptors, and other temporal and spectral characteristics were extracted. High-dimensional feature sets were created and then reduced by mutual-information-based selection algorithms. A Hidden Markov Model based classifier was applied and evaluated using the Viterbi decoding algorithm. In this way, very effective feature sets were identified and the less important features were revealed.
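    The feature-reduction step rests on estimating the mutual information between each feature and the event class. A minimal sketch of that idea, assuming simple equal-width discretization (the binning scheme, feature layout, and function names are illustrative, not the paper's setup):

    ```python
    import numpy as np

    # Sketch of mutual-information-based feature ranking: discretize each
    # feature, estimate I(feature; class) from joint counts, keep the top-k.

    def mutual_information(x, y, bins=8):
        """I(X;Y) in nats for a discretized feature x against integer labels y."""
        xd = np.digitize(x, np.histogram_bin_edges(x, bins=bins)[1:-1])
        joint = np.zeros((bins, y.max() + 1))
        for xi, yi in zip(xd, y):
            joint[xi, yi] += 1
        joint /= joint.sum()
        px = joint.sum(axis=1, keepdims=True)
        py = joint.sum(axis=0, keepdims=True)
        nz = joint > 0                        # skip empty cells: 0 * log 0 := 0
        return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))

    def select_top_k(features, labels, k):
        """Rank columns of `features` by MI with `labels`; return top-k indices."""
        scores = [mutual_information(features[:, j], labels) for j in range(features.shape[1])]
        return sorted(np.argsort(scores)[::-1][:k].tolist())

    rng = np.random.default_rng(1)
    labels = rng.integers(0, 2, 500)
    informative = labels + rng.normal(0, 0.3, 500)   # correlates with the class
    noise = rng.normal(0, 1, (500, 3))               # carries no class information
    X = np.column_stack([noise[:, 0], informative, noise[:, 1:]])
    print(select_top_k(X, labels, 1))                # the informative column: [1]
    ```

    Practical variants also penalize redundancy between selected features (e.g. mRMR-style criteria), since two highly informative but strongly correlated features add little together.
    
    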

    Slovak Dataset for Multilingual Question Answering

    SK-QuAD is the first manually annotated dataset of questions and answers in Slovak. It consists of more than 91k factual questions and answers from various fields. Each question has its answer marked in the corresponding paragraph. The dataset also contains negative examples in the form of “unanswerable questions” and “plausible answers”. It is published free of charge for scientific use. We aim to contribute to the creation of Slovak and multilingual systems for answering questions posed in natural language. The paper provides an overview of existing question answering datasets, describes the annotation process, and statistically analyzes the created content. The dataset expands the possibilities for training and evaluating multilingual language models. Experiments show that the dataset achieves state-of-the-art results for Slovak and improves question answering for other languages in zero-shot learning. We compare the effect of machine-translated data with that of manually annotated data; the additional data improve modeling for low-resource languages.
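    Datasets of this kind typically follow the SQuAD 2.0 record layout: each question points at its paragraph, the answer is given as text plus a character offset, and unanswerable questions are flagged while still carrying a plausible (but wrong) span. Assuming SK-QuAD uses this layout, a record might look like the following — the Slovak text is an invented illustration, not a real dataset entry:

    ```python
    # Invented example of a SQuAD 2.0-style record with one answerable and one
    # unanswerable question over the same paragraph.
    record = {
        "context": "Bratislava je hlavné mesto Slovenska.",
        "qas": [
            {
                "question": "Ktoré mesto je hlavným mestom Slovenska?",
                "is_impossible": False,
                "answers": [{"text": "Bratislava", "answer_start": 0}],
            },
            {
                "question": "Koľko obyvateľov má Bratislava?",
                "is_impossible": True,      # not answerable from this paragraph
                "plausible_answers": [{"text": "Bratislava", "answer_start": 0}],
                "answers": [],
            },
        ],
    }

    # A basic consistency check annotators rely on: the answer text must occur
    # in the context exactly at the recorded character offset.
    for qa in record["qas"]:
        for ans in qa["answers"]:
            start = ans["answer_start"]
            assert record["context"][start:start + len(ans["text"])] == ans["text"]
    ```

    Keeping the span as (text, offset) rather than text alone is what makes extractive evaluation (exact match, token F1) well defined.
    
    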

    Server-based Speech Technologies for Mobile Robotic Applications

    The paper proposes server-based technologies and an overall solution for a multimodal interface (speech and touchscreen) usable for mobile applications in robotics as well as in other domains. A server-based automatic speech recognition service, able to handle several audio input streams, was designed, developed, and connected to an Android application; it receives the input data stream and sends back the recognition result. The second important technology was designed and implemented to synthesize artificial speech: a server-based TTS solution was prepared and connected, applying an HMM-based approach that included recording and training new voices. Finally, a simple client application for Android devices was developed and tested. A discussion of related problems is also provided in the paper.
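    The request/response pattern behind such a server-based recognizer can be sketched with plain sockets: the client sends a length-prefixed audio buffer, the server answers with the recognition result. Everything here is an assumption for illustration — the actual wire protocol is not described in the abstract, and the recognizer is a stub:

    ```python
    import socket
    import struct
    import threading

    # Minimal sketch of a server-based ASR round trip. The recognizer is a
    # placeholder; a real service would run decoding on the received audio.

    def recognize(audio: bytes) -> str:
        """Stub standing in for the real recognizer."""
        return f"<transcript of {len(audio)} bytes>"

    def serve_once(server_sock):
        conn, _addr = server_sock.accept()
        with conn:
            (length,) = struct.unpack("!I", conn.recv(4))  # length-prefixed payload
            audio = b""
            while len(audio) < length:
                audio += conn.recv(length - len(audio))
            conn.sendall(recognize(audio).encode("utf-8"))

    server = socket.socket()
    server.bind(("127.0.0.1", 0))                # ephemeral port for the demo
    server.listen(1)
    threading.Thread(target=serve_once, args=(server,), daemon=True).start()

    with socket.create_connection(server.getsockname()) as client:
        payload = b"\x00" * 3200                 # ~0.1 s of 16 kHz / 16-bit audio
        client.sendall(struct.pack("!I", len(payload)) + payload)
        client.shutdown(socket.SHUT_WR)          # signal end of audio
        result = client.recv(1024).decode("utf-8")
    print(result)
    ```

    A production service handling several concurrent streams, as the paper describes, would accept in a loop and hand each connection to a worker, but the per-connection exchange stays this simple shape.
    
    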